New insights into the phylogeny of Carasobarbus Karaman, 1971 (Actinopterygii, Cyprinidae) with the description of three new species

Fishes of the genus Carasobarbus, widely distributed throughout the river systems of North Africa and West Asia, are commonly referred to as Himris. In the Persian Gulf basin, they are widespread and are found in fast-flowing rivers and the deeper regions of lakes. In this region, these fishes are poorly represented in scientific collections, and except for C. luteus, the species are very poorly documented and frequently misidentified because of their similarity. In this study we analysed the relationships among Carasobarbus species using mitochondrial genes (Cyt b, COI) and present morphological characters based on the examination of specimens. Our results revealed three new species, which we describe here. Carasobarbus doadrioi, new species, is distinguished by 40–44 scales on the lateral line and a prominent black blotch at the end of the caudal peduncle in specimens < 85 mm SL. Carasobarbus hajhosseini, new species, is distinguished by 32–34 scales on the lateral line and a long head (20–24% SL). Carasobarbus saadatii, new species, is distinguished by 38–40 scales on the lateral line and a short head (19–20% SL). In the Persian Gulf basin, Carasobarbus species exhibit uncorrected genetic distances of 1.6% to 5.5% in the COI barcode region and 2.6% to 9.9% in the Cyt b gene. This study highlights the importance of investigating the unexplored diversity that exists within poorly sampled and understudied freshwater fish groups. Such investigations are essential for developing a comprehensive understanding of the true extent of biodiversity, which is critical for informing effective conservation and protection strategies.


Molecular data analysis
The newly obtained sequences and those downloaded from GenBank (Tables 1, 2) were aligned using MAFFT 12,13 as implemented in Geneious v. 10.0.2 (Biomatters, http://www.geneious.com/). The alignments were assembled in Geneious into three datasets: a COI dataset, a Cyt b dataset, and a concatenated dataset. For the ingroup in the concatenated dataset, we kept only the samples for which both genetic markers were amplified from the same specimen. This was not possible for the outgroups, as none of the GenBank sequences used for the outgroups came from the same specimen for both genes. In these cases, sequences from different specimens were concatenated. This does not affect the phylogenetic results of the ingroup. To determine the uncorrected pairwise genetic distances (p-distances) within and between species (Tables 3, 4), we used MEGA 6 14.
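As an illustration of the distance measure named above, the following minimal sketch computes uncorrected p-distances (the proportion of differing sites among the sites compared) for every pair of aligned sequences. It is not the MEGA implementation. The sample names, the toy sequences and the decision to skip sites with a gap or ambiguous base in either sequence (pairwise deletion) are assumptions made for this example only.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, skip_chars="-?N"):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped
    (pairwise deletion; an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def pairwise_p_distances(alignment):
    """Return {(name1, name2): p-distance} for all pairs in a name -> sequence dict."""
    return {
        (n1, n2): p_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(alignment, 2)
    }


# Hypothetical toy alignment, not data from this study.
toy_alignment = {
    "sample_A": "ATGGCACTAA",
    "sample_B": "ATGGCACTAG",
    "sample_C": "ATGACGCT-A",
}
distances = pairwise_p_distances(toy_alignment)
for pair, d in distances.items():
    print(pair, f"{100 * d:.1f}%")
```

Multiplying the returned proportion by 100 gives the percentage form used in Tables 3 and 4.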
Both maximum likelihood (ML) and Bayesian inference (BI) methods were used to reconstruct the phylogenetic relationships of the group. For the ML approach, IQ-TREE 1.6.12 15,16 was used. The optimal substitution model and the best partitioning scheme based on the codon information were determined using ModelFinder 17 with the Bayesian information criterion (BIC). For the single-marker datasets, the codon position information was provided, and for the concatenated dataset both codon position and gene partitions were provided to the program. The bootstrap approximation (-b 500) was used to calculate support values 18. FigTree 1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/) was used to visualize the resulting trees. For the BI approach, MrBayes 3.2.7 19 was used with two parallel simultaneous analyses for 2 × 10^7 generations, each with four MCMC chains, sampling every 2000 generations. The initial 25% of generations were discarded as burn-in. An rjMCMC 20 approach was implemented using the nst=mixed command. Proper convergence of the runs was verified using Tracer 1.7 21.
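The Bayesian settings above imply a fixed number of posterior samples, which the following arithmetic sketch makes explicit. The function name and the returned keys are hypothetical, and the counts ignore the initial state at generation 0, so the figures reported by the software itself may differ by one sample per run.

```python
def sampling_summary(generations, sample_every, runs, burnin_fraction):
    """Count the posterior samples kept after burn-in for the given MCMC settings."""
    samples_per_run = generations // sample_every
    burnin_samples = int(samples_per_run * burnin_fraction)
    retained_per_run = samples_per_run - burnin_samples
    return {
        "samples_per_run": samples_per_run,
        "burnin_samples": burnin_samples,
        "retained_per_run": retained_per_run,
        "retained_total": retained_per_run * runs,
    }


# Settings stated in the text: 2 x 10^7 generations, sampling every 2000
# generations, two parallel runs, first 25% discarded as burn-in.
summary = sampling_summary(
    generations=2 * 10**7, sample_every=2000, runs=2, burnin_fraction=0.25
)
print(summary)
```

Under these assumptions each run contributes 7,500 retained samples, 15,000 in total across the two runs.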
Three molecular species delimitation methods were used: automatic barcode gap discovery (ABGD) 22, assemble species by automatic partitioning (ASAP) 23, and the Bayesian Poisson Tree Processes model (bPTP) 24. The ABGD analysis was performed on its online web server (https://bioinfo.mnhn.fr/abi/public/abgd/abgdweb.html), exploring a prior range of Pmin = 0.001 to Pmax = 0.1 over ten steps with a gap width of 1.5. The ASAP analysis was run with simple distances (p-distances) via its web interface (https://bioinfo.mnhn.fr/abi/public/asap/asapweb.html). The bPTP analysis was run on the in-group only, using the online implementation (https://species.h-its.org/) with default settings.
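The general idea behind distance-based partitioning can be sketched with a deliberately simplified example: sequences are joined into one candidate group whenever their pairwise distance is at or below a threshold (single-linkage clustering). This is neither the ABGD nor the ASAP algorithm, both of which infer the partition from the data rather than from a fixed cut-off. The threshold values, sample names and distances below are hypothetical.

```python
def cluster_by_threshold(names, distances, threshold):
    """Single-linkage grouping: join any two samples whose distance <= threshold."""
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        if d <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical p-distances (as proportions) among four samples.
names = ["s1", "s2", "s3", "s4"]
toy_distances = {
    ("s1", "s2"): 0.002,
    ("s1", "s3"): 0.030,
    ("s1", "s4"): 0.032,
    ("s2", "s3"): 0.031,
    ("s2", "s4"): 0.033,
    ("s3", "s4"): 0.004,
}
partition = cluster_by_threshold(names, toy_distances, threshold=0.016)
print(partition)
```

In this toy case, any threshold lying in the gap between the small within-group distances and the larger between-group distances yields the same two groups, which is the intuition behind a barcode gap.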

Results
We generated 38 new sequences (22 COI + 16 Cyt b) for six species of Carasobarbus from Iran, Iraq and Türkiye, in addition to 173 sequences from NCBI GenBank (Tables 1 and 2). The final COI alignment consisted of 770 base pairs, with 676 constant positions, 88 parsimony-informative positions and 6 singletons; the Cyt b alignment was 1143 base pairs, with 872 constant positions, 240 parsimony-informative positions and 29 singletons (calculated among in-group species only for both genes).
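The three site categories reported above can be illustrated with a short sketch that classifies alignment columns. It assumes the usual definitions: a constant site shows a single state, a parsimony-informative site has at least two states that each occur in at least two sequences, and a singleton is any other variable site. The treatment of gaps and ambiguous bases (ignored) and the toy alignment are assumptions of this example.

```python
from collections import Counter


def classify_sites(sequences, ignore="-?N"):
    """Count constant, parsimony-informative and singleton columns in an alignment."""
    counts = {"constant": 0, "parsimony_informative": 0, "singleton": 0}
    for column in zip(*(seq.upper() for seq in sequences)):
        states = Counter(base for base in column if base not in ignore)
        if len(states) <= 1:
            counts["constant"] += 1
        elif sum(1 for n in states.values() if n >= 2) >= 2:
            counts["parsimony_informative"] += 1
        else:
            counts["singleton"] += 1
    return counts


# Hypothetical toy alignment of four sequences, not data from this study.
toy_sequences = ["AAGTC", "AAGTT", "ACATC", "ACATC"]
site_counts = classify_sites(toy_sequences)
print(site_counts)

# The three categories partition the alignment; for the COI alignment
# reported in the text, 676 + 88 + 6 equals the stated length of 770 bp.
coi_total = 676 + 88 + 6
print(coi_total)
```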
The COI gene of Carasobarbus displayed interspecific uncorrected p-distances ranging from 1.6% between C. luteus and C. chantrei, as well as between C. doadrioi sp. n., C. saadatii sp. n. and C. chantrei, to 5.5% between C. sublimus. The average intraspecific distance for Carasobarbus species was 0.20%, ranging from 0.0% in C. canis, C. hajhosseini and C. kosswigi to 0.72% in clade 1 of the C. fritschii/harterti species group (Table 3).
For the Cyt b gene, the genetic distances between species ranged from 2.6% between C. luteus/apoensis, C. chantrei and C. exulatus to 9.9% between C. harterti and C. chantrei, as well as between C. hajhosseini, C. fritschii and C. harterti. The average intraspecific distance was 0.30%, ranging from 0.04% in C. canis and C. harterti to 0.66% in C. fritschii. Table 4 shows the genetic distances between and within the Carasobarbus species for the Cyt b gene.
The general topology of the Cyt b, COI and concatenated dataset trees (Figs. 2, 3 and 4) was in agreement with previously published phylogenies of the genus Carasobarbus 8,25. The COI and Cyt b datasets both resulted in acceptable trees, with some nodes that were harder to resolve (not well supported). The increased sample size of the individual gene datasets appears to improve the result compared with prior phylogenetic works. The concatenation of the two genetic markers resulted in the best-resolved tree, even though the number of represented species was reduced. In general, all species analysed in any of the datasets were recovered as monophyletic, apart from C. harterti and C. fritschii in the COI dataset. In this case, the resolution of the COI dataset seems to be inadequate for this part of the tree, and some samples identified as C. harterti are placed with C. fritschii and vice versa.
two pairs of barbels (vs. one pair), well-developed median lobe on the lower lip (vs. without median lobe) and more scales on the lateral line (40-44 vs. 25-30) (Table 5).
Carasobarbus hajhosseini sp. n. is similar to C. kosswigi but can be distinguished by a slightly developed lower lip lobe (vs. well-developed), shorter head (20-24 vs. 24-27% SL), shorter posterior barbel (13-20 vs. 21-38% HL) and shorter snout (25-31 vs. 36-44% HL).

Description
Also, the new species can be distinguished from C. luteus by having two pairs of barbels (vs. one pair), a well-developed median lobe on the lower lip (vs. without median lobe) and more scales on the lateral line (32-34 vs. 25-30).

Coloration
In fresh specimens: Body silvery or cream-white. The back darker than the belly. Upper lateral line scales outlined by dark pigmentation, evident anteriorly and fading posteriorly. Fins with scattered dark melanophores on rays and membranes. In formalin: Body cream-brown, back darker than belly. Upper lateral line scales outlined by dark pigmentation, prominent in the anterior section and fading towards the posterior.

Distribution
The new species is known from the Gamasiab, Kahman, Kashkan and Seymareh rivers in the Karkheh drainage.

Etymology
The species is named in honour of Haj Hossein Javadi Pour (HHJP), who is the father of the first author of this study (AJR).

Habitat
Carasobarbus hajhosseini is commonly found in the deep, swiftly flowing sections of rivers and dam reservoirs (Fig. 14). It typically favours areas with abundant vegetation, and during the summer, it can also be observed in shallower waters. Generally, the species is most abundant in the middle and lower Karkheh drainage. Luciobar
The new species can be distinguished from C. luteus (Fig. 22) by having two pairs of barbels (vs. one pair), a well-developed median lobe on the lower lip (vs. without median lobe) (Fig. 23) and more scales on the lateral line (38-40 vs. 25-30).

Coloration
In fresh specimens: Body silvery or cream-white. The back darker than the belly. Upper lateral line scales outlined by dark pigmentation, evident anteriorly and fading posteriorly. Fins with scattered dark melanophores on rays

Distribution
The new species is distributed in the lower Karun drainage as well as the Great Zab in the Tigris drainage.

Habitat
The new species is usually found in the deeper parts of rivers and dam reservoirs, where water flows are slower and there is ample vegetation and cover (Fig. 24). During the summer months, it also disperses into faster-flowing waters, likely because of warming water temperatures in its typical habitat. It prefers areas along the banks and around islands where tree roots and aquatic plants are accessible. This allows it to forage while remaining hidden among the vegetation to avoid predators. The species appears to be most abundant in the middle and lower Karun. Luciobarbus barbulus (Heckel, 1847), Capoeta aculeata (Valenciennes, 1844), Garra rufa, Chondrostoma regium, Alburnus sellal, Squalius lepidus, Squalius berak and Glyptothorax cous were found coexisting with the new species.

Discussion
In general, fishes of the genus Carasobarbus are bottom feeders, with morphological characters specialised for such behaviour. This is especially visible in the differences in the development of their mouth structure and lips. Similar developments have been observed in other species of barbs 26,27. Lip development in Carasobarbus appears to be a suitable character for separating species 28. Among the newly described species, C. hajhosseini has the smallest (least developed) lips, whereas C. doadrioi appears to have the most developed lips. The ventral views of the head (Fig. 23) allow these differences to be compared and show that these two species represent both ends of the spectrum.
In general, nearly all the internal nodes are well resolved in all three datasets (COI, Cyt b and concatenated) used in the molecular phylogenetic analyses, but, as expected, the concatenated dataset resulted in the best-resolved tree. Both genetic markers used in the concatenated dataset are mitochondrial, i.e. they share the same evolutionary history. This indicates that the improvement in phylogenetic resolution is most probably due to the increase in phylogenetic signal encoded in a longer sequence fragment. This underlines the importance of including multiple markers to resolve the remaining obscure relationships within the genus. On the other hand, the hexaploidy of these fishes complicates the inclusion of any nuclear marker in genetic studies in the near future 29. This is important because some species of the genus (for example C. luteus) are widespread in a variety of habitats, and it would therefore not be surprising to find that different populations do not share the same evolutionary history. This will not be visible without analysing both mitochondrial and nuclear genomic markers.
In the mitochondrial phylogenetic results of the present study, the only unresolved relationship is the one between C. doadrioi, C. hajhosseini and C. saadatii. The very short internal branch at this level, when present, suggests a potential rapid speciation event, leaving only a small number of conserved changes with which to resolve this relationship. In our results based on the partial COI gene, two clearly separate clades are formed, both containing sequences identified as C. harterti and C. fritschii. This is most probably the result of misidentification, although it could also be due to introgression events. As we do not have access to the material used in this case (the sequences were retrieved from GenBank), we cannot elaborate further on this or corroborate the identity of each clade. On the other hand, other individuals identified as these two species separate well in the Cyt b dataset, with no further issues. Another possible issue that needs further investigation is the placement of samples identified as C. apoensis within the C. luteus clade, with practically no genetic difference between them. This point was also mentioned by Borkenhagen 28. Based on this observation, we recommend a systematic revision of both C. apoensis and C. luteus in future studies.

Fig. 2. Phylogenetic tree of Carasobarbus based on the maximum likelihood and Bayesian analyses of the mitochondrial COI barcode region. Numbers at each node are bootstrap/posterior probability support values. The results of the three species delimitation methods are shown by the vertical bars.

Fig. 3. Phylogenetic tree of Carasobarbus based on the maximum likelihood and Bayesian analyses of the Cyt b gene. Numbers at each node are bootstrap/posterior probability support values. The results of the three species delimitation methods are shown by the vertical bars.

Fig. 4. Phylogenetic tree of Carasobarbus based on the maximum likelihood and Bayesian analyses of the concatenated mitochondrial COI barcode region and Cyt b markers. Numbers at each node are bootstrap/posterior probability support values.
Carasobarbus saadatii also shows intermediate lip development, similar to C. sublimus for example, but we do not have an acceptable picture to show in this work. Borkenhagen and Krupp 2 questioned the locality data of the C. sublimus specimen (CMNFI 1979-0277), as the morphometric and meristic characters (scales in the lateral line, above the lateral line, and around the least circumference of the caudal peduncle; length of the dorsal, pectoral, ventral, and anal fins) of this specimen are within the range of C. sublimus and outside the range of C. kosswigi. This discrepancy is unsurprising because the Karkheh population belongs to C. hajhosseini, and the range of these characters matches the locality mentioned for this voucher specimen. However, they considered C. hajhosseini populations as C. sublimus, and C. doadrioi and C. saadatii as C. kosswigi, which expanded the ranges of the morphometric characters and rendered C. kosswigi and C. sublimus paraphyletic in the phylogenetic trees.

Table 5. Lateral line scale count. Significant values are in bold.

Table 7. Morphometric data of C. saadatii sp. n. (holotype BIAUBM 8-H and paratypes AJRPC 24-P; n = 5), C. sublimus (VPFC Zard 1400.9., VPFC Fahlian 1400.10; n = 8) and C. kosswigi (FFR 416, FFR 417, FFR 421; n = 17).

Etymology
The species is named in honour of Mohamadali Saadati (Mashhad), acknowledging his significant contributions to the taxonomy of freshwater fishes in Iran. He holds the distinction of being the first Iranian ichthyologist, having conducted a systematic study on the taxonomy and distribution of freshwater fishes in Iran in 1977. To this day, his findings continue to be used by ichthyologists in Iran.